AITopics

Country: Asia > China (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Technology:

Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsFeb-18-2026, 12:42:16 GMT

e6e706454d72c18582b9c1ff70b11f7d-Paper-Conference.pdf

data mining, information, machine learning, (18 more...)

Country:

Europe > United Kingdom (0.14)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Neural Information Processing SystemsDec-24-2025, 17:53:19 GMT

Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks

It is known that the current graph neural networks (GNNs) are difficult to make themselves deep due to the problem known as over-smoothing. Multi-scale GNNs are a promising approach for mitigating the over-smoothing problem. However, there is little explanation of why it works empirically from the viewpoint of learning theory. In this study, we derive the optimization and generalization guarantees of transductive learning algorithms that include multi-scale GNNs. Using the boosting theory, we prove the convergence of the training error under weak learning-type conditions. By combining it with generalization gap bounds in terms of transductive Rademacher complexity, we show that a test error bound of a specific type of multi-scale GNNs that decreases corresponding to the number of node aggregations under some conditions. Our results offer theoretical explanations for the effectiveness of the multi-scale structure against the over-smoothing problem. We apply boosting algorithms to the training of multi-scale GNNs for real-world node prediction tasks. We confirm that its performance is comparable to existing GNNs, and the practical behaviors are consistent with theoretical observations.

multi-scale graph neural network, optimization and generalization analysis, transduction, (8 more...)

Genre: Research Report > New Finding (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Neural Information Processing SystemsOct-10-2025, 19:59:12 GMT

HC-GAE: The Hierarchical Cluster-based Graph Auto-Encoder for Graph Representation Learning Lu Bai

Graph Auto-Encoders (GAEs) are powerful tools for graph representation learning.

graph, information, representation, (16 more...)

Country:

Europe > United Kingdom (0.14)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsAug-16-2025, 19:01:22 GMT

dab49080d80c724aad5ebf158d63df41-Paper.pdf

algorithm, international conference, proceedings, (10 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Information Management (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceAug-13-2025

HSA-Net: Hierarchical and Structure-Aware Framework for Efficient and Scalable Molecular Language Modeling

Shao, Zihang, Lei, Wentao, Wang, Lei, Ye, Wencai, Liu, Li

Molecular representation learning, a cornerstone for downstream tasks like molecular captioning and molecular property prediction, heavily relies on Graph Neural Networks (GNN). However, GNN suffers from the over-smoothing problem, where node-level features collapse in deep GNN layers. While existing feature projection methods with cross-attention have been introduced to mitigate this issue, they still perform poorly in deep features. This motivated our exploration of using Mamba as an alternative projector for its ability to handle complex sequences. However, we observe that while Mamba excels at preserving global topological information from deep layers, it neglects fine-grained details in shallow layers. The capabilities of Mamba and cross-attention exhibit a global-local trade-off. To resolve this critical global-local trade-off, we propose Hierarchical and Structure-Aware Network (HSA-Net), a novel framework with two modules that enables a hierarchical feature projection and fusion. Firstly, a Hierarchical Adaptive Projector (HAP) module is introduced to process features from different graph layers. It learns to dynamically switch between a cross-attention projector for shallow layers and a structure-aware Graph-Mamba projector for deep layers, producing high-quality, multi-level features. Secondly, to adaptively merge these multi-level features, we design a Source-Aware Fusion (SAF) module, which flexibly selects fusion experts based on the characteristics of the aggregation features, ensuring a precise and effective final representation fusion. Extensive experiments demonstrate that our HSA-Net framework quantitatively and qualitatively outperforms current state-of-the-art (SOTA) methods.

large language model, machine learning, natural language, (18 more...)

2508.08334

Country: Asia > China (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.67)

Neural Information Processing SystemsOct-11-2024, 13:05:10 GMT

Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks

multi-scale gnn, multi-scale graph neural network, optimization and generalization analysis, (5 more...)

Genre: Research Report > New Finding (0.43)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.40)

arXiv.org Artificial IntelligenceJul-1-2024

Bridging Smoothness and Approximation: Theoretical Insights into Over-Smoothing in Graph Neural Networks

Yang, Guangrui, Li, Jianfei, Li, Ming, Feng, Han, Zhou, Ding-Xuan

In this paper, we explore the approximation theory of functions defined on graphs. Our study builds upon the approximation results derived from the $K$-functional. We establish a theoretical framework to assess the lower bounds of approximation for target functions using Graph Convolutional Networks (GCNs) and examine the over-smoothing phenomenon commonly observed in these networks. Initially, we introduce the concept of a $K$-functional on graphs, establishing its equivalence to the modulus of smoothness. We then analyze a typical type of GCN to demonstrate how the high-frequency energy of the output decays, an indicator of over-smoothing. This analysis provides theoretical insights into the nature of over-smoothing within GCNs. Furthermore, we establish a lower bound for the approximation of target functions by GCNs, which is governed by the modulus of smoothness of these functions. This finding offers a new perspective on the approximation capabilities of GCNs. In our numerical experiments, we analyze several widely applied GCNs and observe the phenomenon of energy decay. These observations corroborate our theoretical results on exponential decay order.

graph, k-functional, neural network, (15 more...)

2407.01281

Country:

Asia > China > Hong Kong (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial IntelligenceMar-22-2024

GTC: GNN-Transformer Co-contrastive Learning for Self-supervised Heterogeneous Graph Representation

Sun, Yundong, Zhu, Dongjie, Wang, Yansong, Tian, Zhaoshuo

Graph Neural Networks (GNNs) have emerged as the most powerful weapon for various graph tasks due to the message-passing mechanism's great local information aggregation ability. However, over-smoothing has always hindered GNNs from going deeper and capturing multi-hop neighbors. Unlike GNNs, Transformers can model global information and multi-hop interactions via multi-head self-attention and a proper Transformer structure can show more immunity to the over-smoothing problem. So, can we propose a novel framework to combine GNN and Transformer, integrating both GNN's local information aggregation and Transformer's global information modeling ability to eliminate the over-smoothing problem? To realize this, this paper proposes a collaborative learning scheme for GNN-Transformer and constructs GTC architecture. GTC leverages the GNN and Transformer branch to encode node information from different views respectively, and establishes contrastive learning tasks based on the encoded cross-view information to realize self-supervised heterogeneous graph representation. For the Transformer branch, we propose Metapath-aware Hop2Token and CG-Hetphormer, which can cooperate with GNN to attentively encode neighborhood information from different levels. As far as we know, this is the first attempt in the field of graph representation learning to utilize both GNN and Transformer to collaboratively capture different view information and conduct cross-view contrastive learning. The experiments on real datasets show that GTC exhibits superior performance compared with state-of-the-art methods. Codes can be available at https://github.com/PHD-lanyu/GTC.

artificial intelligence, information, machine learning, (16 more...)

2403.1552

Country:

Asia > China > Heilongjiang Province > Harbin (0.04)
Europe > Greece (0.04)
Europe > Austria (0.04)
Asia > China > Shandong Province (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceFeb-10-2024

Cross-Space Adaptive Filter: Integrating Graph Topology and Node Attributes for Alleviating the Over-smoothing Problem

Huang, Chen, Li, Haoyang, Zhang, Yifan, Lei, Wenqiang, Lv, Jiancheng

The vanilla Graph Convolutional Network (GCN) uses a low-pass filter to extract low-frequency signals from graph topology, which may lead to the over-smoothing problem when GCN goes deep. To this end, various methods have been proposed to create an adaptive filter by incorporating an extra filter (e.g., a high-pass filter) extracted from the graph topology. However, these methods heavily rely on topological information and ignore the node attribute space, which severely sacrifices the expressive power of the deep GCNs, especially when dealing with disassortative graphs. In this paper, we propose a cross-space adaptive filter, called CSF, to produce the adaptive-frequency information extracted from both the topology and attribute spaces. Specifically, we first derive a tailored attribute-based high-pass filter that can be interpreted theoretically as a minimizer for semi-supervised kernel ridge regression. Then, we cast the topology-based low-pass filter as a Mercer's kernel within the context of GCNs. This serves as a foundation for combining it with the attribute-based filter to capture the adaptive-frequency information. Finally, we derive the cross-space filter via an effective multiple-kernel learning strategy, which unifies the attribute-based high-pass filter and the topology-based low-pass filter. This helps to address the over-smoothing problem while maintaining effectiveness. Extensive experiments demonstrate that CSF not only successfully alleviates the over-smoothing problem but also promotes the effectiveness of the node classification task.

graph, information, node, (16 more...)

2401.14876

Country:

North America > United States > Texas (0.05)
Asia > China > Sichuan Province > Chengdu (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)